Model Quantization, Inference Optimization, GGUF Format, Privacy-Preserving AI

OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·15h
Cache Theory
Three Solutions to Nondeterminism in AI
blog.hellas.ai·2d·
Discuss: Hacker News
🎯Performance Proofs
A small number of samples can poison LLMs of any size
dev.to·16h·
Discuss: DEV
🎵Audio ML
LoRA Explained: Faster, More Efficient Fine-Tuning with Docker
docker.com·1d
🌀Brotli Internals
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai·23h·
Discuss: Hacker News
🔍Vector Forensics
SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
arxiv.org·15h
🧠Machine Learning
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·4d·
Discuss: Hacker News
🧮Compute Optimization
The Hidden Oracle Inside Your AI: Unveiling Data Density with Latent Space Magic by Arvind Sundararajan
dev.to·1d·
Discuss: DEV
🧠Machine Learning
Neuro-Symbolic AI
en.wikipedia.org·4h·
Discuss: Hacker News
🔲Cellular Automata
Tool or Agent? The impact of AI in your code and in your wallet. It all boils down to math again!
blog.codeminer42.com·1d
Proof Automation
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·5h·
Discuss: Hacker News
🎯Performance Proofs
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
arxiv.org·1d
🧮Prolog Parsing
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning
arxiviq.substack.com·1d·
Discuss: Substack
Incremental Computation
Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
arxiv.org·3d
🧮Kolmogorov Complexity
Show HN: Nanowakeword – Automates custom wake word model training
github.com·7h·
Discuss: Hacker News
🎙️Whisper
Real-Time Adaptive Sparsity Optimization for Edge-Deployed AI Inference Accelerators
dev.to·9h·
Discuss: DEV
🌊Streaming Compression
The key to conversational speech recognition
datasciencecentral.com·1d
🎵Audio ML
Expanding the Action Space of LLMs to Reason Beyond Language
arxiv.org·15h
🌳Context free grammars
[D] Anyone using smaller, specialized models instead of massive LLMs?
reddit.com·1d
🎯Performance Proofs
OpenAI's inflated valuation, as I understand it
taloranderson.com·3h·
Discuss: Hacker News
🧠Intelligence Compression